source: http://github.com/muschellij2/CMStat_2018

UK Biobank Data

  • 500,000 participants
  • 100,000 included in the accelerometry sub-study
  • Can infer movement with longitudinal follow-up

Data Gathered

  • Tri-axial Axivity accelerometer, 100 Hz, over 7 days
  • Started at 10AM and ended at 10PM
  • Data measured in milli-g (1\(g\) = 9.80665 \(\frac{m}{s^2}\))
    • not counts or steps as other devices
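
Since the measurements are in milli-g rather than counts or steps, converting to SI units is a one-line scaling. A minimal sketch (the helper name is mine, not part of any UK Biobank tooling):

```python
# 1 g = 9.80665 m/s^2, so 1 milli-g = 9.80665e-3 m/s^2
G_MS2 = 9.80665

def millig_to_ms2(mg):
    """Convert acceleration in milli-g to m/s^2."""
    return mg * G_MS2 / 1000.0
```

So a reading of 1000 milli-g corresponds to one full gravity, 9.80665 m/s².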

The devil is (or can be) in the inclusion criteria

UK Biobank Data

Demographics: Lots of Non-Response

                                 Completed:     Completed:    No            Not
                                 good data      bad data      response      asked
n                                96701          7005          132800        266110
Age at initial visit, mean (SD)  56.6 (7.8)     55.2 (7.9)    56.4 (8.0)    57.5 (8.1)
Male, n (%)                      42255 (43.7)   3156 (45.1)   62601 (47.1)  121151 (45.5)
Non-white ethnicity, n (%)       2983 (3.1)     335 (4.8)     7617 (5.8)    16102 (6.1)

Many people DIED before being able to be asked

Assessment to Accelerometry can be a WHILE

Responders are Healthier (Self-Reported)

                       Completed:     Completed:    No            Not
                       good data      bad data      response      asked
Overall health, n (%)
  Excellent            20987 (21.8)   1464 (21.0)   21583 (16.3)  37849 (14.4)
  Good                 57849 (60.0)   4057 (58.1)   78968 (59.7)  148196 (56.2)
  Fair                 15149 (15.7)   1261 (18.0)   26669 (20.2)  62313 (23.6)
  Poor                 2482 (2.6)     205 (2.9)     4969 (3.8)    15124 (5.7)

Accelerometry Data Available

  • Data at varying levels
    • Axivity CWA format (highest resolution, 100 Hz)
      • very large for 100K subjects
  • 5 second level data
    • imputation/processing done
    • averaged into minute-level data
  • Overall statistics (mean/median): overall, daily, hourly, day of week
    • removed “non-wear” periods
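
The 5-second-to-minute averaging can be sketched as follows. This is an illustration (function name and input layout are mine), assuming twelve complete 5-second epochs per minute:

```python
def to_minute_level(epochs_5s):
    """Collapse a list of 5-second epoch values into minute-level means.

    Twelve 5-second epochs make one minute; a trailing partial
    minute is dropped.
    """
    minutes = []
    usable = len(epochs_5s) - len(epochs_5s) % 12
    for i in range(0, usable, 12):
        chunk = epochs_5s[i:i + 12]
        minutes.append(sum(chunk) / 12.0)
    return minutes
```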

Processing done (not by me)

Auto-calibration [@van2014autocalibration]

  • 10\(s\) windows where the SD on all three axes is \(< 13.0\) m\(g\) are taken as non-movement
  • fit a unit gravity sphere to these points using OLS
  • if all 3 axes had values spanning the \(\pm 300\) m\(g\) range, use that person's calibration coefficients
  • if not, use the next person's accelerometer record from the same device
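
The sphere fit can be sketched as an iterative per-axis OLS adjustment: project the current calibrated points onto the unit sphere, regress the projections on the calibrated values, and fold the slope/intercept into the gain and offset. This illustrates the general technique, not the exact published implementation:

```python
import numpy as np

def calibrate(stationary, n_iter=100):
    """Estimate per-axis offset and gain so that non-movement samples
    lie on the unit gravity sphere.

    stationary: (n, 3) array of non-movement samples in g units.
    Returns (offset, gain), each of shape (3,), applied as
    (x + offset) * gain.
    """
    offset = np.zeros(3)
    gain = np.ones(3)
    for _ in range(n_iter):
        cal = (stationary + offset) * gain
        norms = np.linalg.norm(cal, axis=1, keepdims=True)
        target = cal / norms  # closest points on the unit sphere
        for ax in range(3):
            # OLS of sphere targets on calibrated values: target ~ a*cal + b
            a, b = np.polyfit(cal[:, ax], target[:, ax], 1)
            gain[ax] = a * gain[ax]
            offset[ax] = offset[ax] + b / gain[ax]
    return offset, gain
```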

Lesson #1: If magnitude is important, need calibration (“batch effect” correction)

Processing done (not by me)

  • recording errors and ‘interrupts’ (e.g., plugging in the accelerometer) flagged
  • values at the \(\pm 8g\) limit flagged (clipping)
  • resampled to 100 Hz (interrupts > 5 seconds set to missing)
  • Euclidean norm, then fourth-order Butterworth low-pass filter (f = 20 Hz)
  • subtract \(1g\), set negative values to \(0\)

@doherty2017large

At least they have software to do this! (Written in LANGUAGE?)
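
The norm/filter/truncate steps above can be sketched with NumPy/SciPy. This is an illustration of the described steps, not the actual UK Biobank processing software:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def euclidean_norm_minus_one(xyz, fs=100.0, cutoff=20.0):
    """Process raw tri-axial data (in g units) as described:
    Euclidean norm, 4th-order low-pass Butterworth at 20 Hz,
    subtract 1 g, truncate negatives at zero.

    xyz: (n, 3) array; returns (n,) processed magnitude.
    """
    vm = np.linalg.norm(xyz, axis=1)        # Euclidean norm of the 3 axes
    b, a = butter(4, cutoff, btype="low", fs=fs)
    vm = filtfilt(b, a, vm)                 # zero-phase low-pass filter
    return np.clip(vm - 1.0, 0.0, None)     # remove gravity, floor at 0
```

A device at rest (1 g on one axis, nothing on the others) should come out as all zeros.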

Goal of analysis

  • Explore the data
  • Get similar findings to some [@doherty2017large] analysis
  • Assess “bias” in different devices (see if autocalibration is working)

How do people typically move? (First plot)

  • Average all data - regardless of multiple visits per person
  1. Run the full data - show the weird artifacts
  2. Run only 95%-complete days - show how some of it goes away
  • say maybe solved
  1. Run by day - show all 7 days and the weird features
  2. Show how the threshold throws out days 0 and 7 (probably a good thing) - peak problem
  3. Show that the jumps remain
  4. Well, maybe it’s the mean - just use the median - that should fix it - nope
  5. Show 1-2 participants’ raw data (need this)

How do people typically move?

  • Only keep days where at least 95% of minutes (1368 of 1440) are non-missing
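
A sketch of this completeness rule (the data layout is assumed for illustration; 95% of 1440 minutes is 1368):

```python
MINUTES_PER_DAY = 1440
THRESHOLD = 0.95  # i.e., at least 1368 observed minutes

def keep_full_days(days):
    """days: dict mapping day -> list of 1440 minute values (None = missing).
    Returns the subset of days passing the 95% completeness rule."""
    kept = {}
    for day, minutes in days.items():
        observed = sum(v is not None for v in minutes)
        if observed >= THRESHOLD * MINUTES_PER_DAY:
            kept[day] = minutes
    return kept
```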

Maybe Lesson #2: Keep only full days

Not so fast, let’s look day by day

  • Take days 0 - 7, where day 0 is when they started wearing the device
    • this is what we saw in the data, though it should be days 0 - 6
  • Calculate statistic over all participants
    • the same minute across people
    • each day separately
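
Computing the same-minute statistic across participants, separately for each day, can be sketched as follows (the per-participant data structure is assumed for illustration):

```python
def minute_profile(records, stat):
    """records: list of per-participant dicts mapping day -> list of minute values.
    Returns dict day -> list where entry m is stat() over all participants'
    values for minute m on that day."""
    days = sorted({d for r in records for d in r})
    profile = {}
    for d in days:
        # line up minute m across every participant who has this day
        cols = zip(*(r[d] for r in records if d in r))
        profile[d] = [stat(list(c)) for c in cols]
    return profile
```

Passing a mean gives the minute-level average curve; swapping in a median is a one-argument change.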

Randomly Sample 2000 people

Maybe it’s a “few bad apples”

We’re ready for analysis right?

  • Nope! Need to estimate “wear time”

Wear time

“We removed non-wear time, defined as consecutive stationary episodes lasting for at least 60 minutes where all three axes had a standard deviation of less than 13.0 m\(g\)”

@doherty2017large

Why 13m\(g\)?

“Here, 13 m\(g\) was selected just above the empirically derived baseline (noise) standard deviation of 10 m\(g\) to retain only nonmovement periods.” @van2014autocalibration
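
The quoted non-wear rule can be sketched as below. For simplicity this checks stationarity in one-minute blocks rather than arbitrary-length episodes, so it is an approximation of the published definition, not a reimplementation of it:

```python
import numpy as np

def nonwear_minutes(xyz, fs=100, sd_thresh_mg=13.0, min_block=60):
    """Flag non-wear: runs of >= min_block consecutive minutes where the
    SD of every axis within each minute is below sd_thresh_mg.

    xyz: (n, 3) array in milli-g; returns a boolean mask per minute."""
    spm = int(fs * 60)                    # samples per minute
    n_min = xyz.shape[0] // spm
    stationary = np.empty(n_min, dtype=bool)
    for m in range(n_min):
        w = xyz[m * spm:(m + 1) * spm]
        stationary[m] = (w.std(axis=0) < sd_thresh_mg).all()
    # keep only runs of stationary minutes lasting >= min_block
    nonwear = np.zeros(n_min, dtype=bool)
    run_start = None
    for m in range(n_min + 1):
        if m < n_min and stationary[m]:
            if run_start is None:
                run_start = m
        else:
            if run_start is not None and m - run_start >= min_block:
                nonwear[run_start:m] = True
            run_start = None
    return nonwear
```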

Takehome Messages

  1. Start off smaller than 100K people
  2. Inspect the raw(ish) data
  3. Autocalibration seems to work well on gross features (with 100K people)
  4. Artifacts still seem present in the data
  5. Back to the 100Hz data we go!

Next data installment

“We invited some participants to wear an activity monitor for a week, four times a year. … finished in early 2019.”